Results 1 - 12 of 12
1.
Int J Biol Macromol ; 260(Pt 2): 129570, 2024 Mar.
Article in English | MEDLINE | ID: mdl-38246456

ABSTRACT

Sodium lignosulfonate, an abundant natural resource, is regarded as an ideal precursor for the synthesis of hard carbon. The development of high-performance, low-cost and sustainable anode materials is a significant challenge facing lithium-ion batteries (LIBs). The modulation of morphology and defect structure during thermal transformation is crucial to improve Li⁺ storage behavior. Using sodium lignosulfonate as a precursor, two-dimensional carbon nanosheets with a high density of defects were synthesized. The synergistic influence of ice templates and KCl was leveraged: the ice prevented clumping of potassium chloride during drying, and the KCl served as a skeletal support during pyrolysis. The combined action of both templates resulted in the formation of an interconnected two-dimensional nanosheet structure. The optimized sample has a charging capacity of 712.4 mA h g⁻¹ at 0.1 A g⁻¹, which is contributed by the slope region. After 200 cycles at 0.2 A g⁻¹, the specific charge capacity remains 514.4 mA h g⁻¹, and a high specific charge capacity of 333.8 mA h g⁻¹ is retained after 800 cycles at 2 A g⁻¹. The proposed investigation offers a promising approach for developing high-performance, low-cost carbon-based anode materials that could be used in advanced lithium-ion batteries.


Subject(s)
Ice , Lignin/analogs & derivatives , Lithium , Crystallization , Carbon
2.
medRxiv ; 2023 Oct 02.
Article in English | MEDLINE | ID: mdl-37873131

ABSTRACT

Though electronic health record (EHR) systems are a rich repository of clinical information with large potential, the use of EHR-based phenotyping algorithms is often hindered by inaccurate diagnostic records, the presence of many irrelevant features, and the requirement for a human-labeled training set. In this paper, we describe a knowledge-driven online multimodal automated phenotyping (KOMAP) system that i) generates a list of informative features by an online narrative and codified feature search engine (ONCE) and ii) enables the training of a multimodal phenotyping algorithm based on summary data. Powered by composite knowledge from multiple EHR sources, online article corpora, and a large language model, features selected by ONCE show high concordance with the state-of-the-art AI models (GPT4 and ChatGPT) and encourage large-scale phenotyping by providing a smaller but highly relevant feature set. Validation of the KOMAP system across four healthcare centers suggests that it can generate efficient phenotyping algorithms with robust performance. Compared to other methods requiring patient-level inputs and gold-standard labels, the fully online KOMAP provides a significant opportunity to enable multi-center collaboration.

3.
medRxiv ; 2023 May 21.
Article in English | MEDLINE | ID: mdl-37293026

ABSTRACT

Objective: Electronic health record (EHR) systems contain a wealth of clinical data stored as both codified data and free-text narrative notes, covering hundreds of thousands of clinical concepts available for research and clinical care. The complex, massive, heterogeneous, and noisy nature of EHR data imposes significant challenges for feature representation, information extraction, and uncertainty quantification. To address these challenges, we proposed an efficient Aggregated naRrative Codified Health (ARCH) records analysis to generate a large-scale knowledge graph (KG) for a comprehensive set of EHR codified and narrative features. Methods: The ARCH algorithm first derives embedding vectors from a co-occurrence matrix of all EHR concepts and then generates cosine similarities along with associated p-values to measure the strength of relatedness between clinical features with statistical certainty quantification. In the final step, ARCH performs a sparse embedding regression to remove indirect linkage between entity pairs. We validated the clinical utility of the ARCH knowledge graph, generated from 12.5 million patients in the Veterans Affairs (VA) healthcare system, through downstream tasks including detecting known relationships between entity pairs, predicting drug side effects, disease phenotyping, as well as sub-typing Alzheimer's disease patients. Results: ARCH produces high-quality clinical embeddings and KG for over 60,000 EHR concepts, as visualized in the R-shiny powered web-API (https://celehs.hms.harvard.edu/ARCH/). The ARCH embeddings attained an average area under the ROC curve (AUC) of 0.926 and 0.861 for detecting pairs of similar EHR concepts when the concepts are mapped to codified data and to NLP data, respectively; and 0.810 (codified) and 0.843 (NLP) for detecting related pairs. Based on the p-values computed by ARCH, the sensitivities of detecting similar and related entity pairs are 0.906 and 0.888, respectively, under false discovery rate (FDR) control of 5%.
For detecting drug side effects, the cosine similarity based on the ARCH semantic representations achieved an AUC of 0.723, while the AUC improved to 0.826 after few-shot training via minimizing the loss function on the training data set. Incorporating NLP data substantially improved the ability to detect side effects in the EHR. For example, based on unsupervised ARCH embeddings, the power of detecting drug-side effect pairs when using codified data only was 0.15, much lower than the power of 0.51 when using both codified and NLP concepts. Compared to existing large-scale representation learning methods including PubmedBERT, BioBERT and SAPBERT, ARCH attains the most robust performance and substantially higher accuracy in detecting these relationships. Incorporating ARCH-selected features in weakly supervised phenotyping algorithms can improve the robustness of algorithm performance, especially for diseases that benefit from NLP features as supporting evidence. For example, the phenotyping algorithm for depression attained an AUC of 0.927 when using ARCH-selected features but only 0.857 when using codified features selected via the KESER network[1]. In addition, embeddings and knowledge graphs generated from the ARCH network were able to cluster AD patients into two subgroups, where the fast progression subgroup had a much higher mortality rate. Conclusions: The proposed ARCH algorithm generates large-scale high-quality semantic representations and a knowledge graph for both codified and NLP EHR features, useful for a wide range of predictive modeling tasks.
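
As a rough illustration of the similarity step described in this abstract, the following sketch ranks concepts by the cosine similarity of their embedding vectors. The concept names and three-dimensional vectors are hypothetical toy values, not ARCH output, and the sketch omits the p-value computation and the sparse embedding regression, whose details the abstract does not give.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # convention for a zero vector: no similarity
    return dot / (norm_u * norm_v)


def rank_neighbors(target, embeddings):
    """Sort all other concepts by decreasing cosine similarity to `target`."""
    scores = [
        (name, cosine_similarity(embeddings[target], vec))
        for name, vec in embeddings.items()
        if name != target
    ]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


# Hypothetical toy embeddings (assumed values for illustration only).
embeddings = {
    "drug_A": [0.9, 0.1, 0.0],
    "side_effect_X": [0.8, 0.2, 0.1],
    "unrelated_lab": [0.0, 0.1, 0.9],
}

neighbors = rank_neighbors("drug_A", embeddings)
```

Here `neighbors` lists `side_effect_X` ahead of `unrelated_lab`, because its vector points in nearly the same direction as that of `drug_A`.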

4.
Bioinformatics ; 39(2)2023 02 03.
Article in English | MEDLINE | ID: mdl-36805623

ABSTRACT

MOTIVATION: Predicting molecule-disease indications and side effects is important for drug development and pharmacovigilance. Comprehensively mining molecule-molecule, molecule-disease and disease-disease semantic dependencies can potentially improve prediction performance. METHODS: We introduce a Multi-Modal REpresentation Mapping Approach to Predicting molecular-disease relations (M2REMAP) by incorporating clinical semantics learned from electronic health records (EHR) of 12.6 million patients. Specifically, M2REMAP first learns a multimodal molecule representation that synthesizes chemical property and clinical semantic information by mapping molecule chemicals via a deep neural network onto the clinical semantic embedding space shared by drugs, diseases and other common clinical concepts. To infer molecule-disease relations, M2REMAP combines multimodal molecule representation and disease semantic embedding to jointly infer indications and side effects. RESULTS: We extensively evaluate M2REMAP on molecule indications, side effects and interactions. Results show that incorporating EHR embeddings improves performance significantly, for example, attaining an improvement over the baseline models by 23.6% in PRC-AUC on indications and 23.9% on side effects. Further, M2REMAP overcomes the limitation of existing methods and effectively predicts drugs for novel diseases and emerging pathogens. AVAILABILITY AND IMPLEMENTATION: The code is available at https://github.com/celehs/M2REMAP, and prediction results are provided at https://shiny.parse-health.org/drugs-diseases-dev/. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subject(s)
Drug-Related Side Effects and Adverse Reactions , Humans , Drug Development , Electronic Health Records , Neural Networks, Computer , Pharmacovigilance
5.
Biostatistics ; 24(3): 760-775, 2023 Jul 14.
Article in English | MEDLINE | ID: mdl-35166342

ABSTRACT

Leveraging large-scale electronic health record (EHR) data to estimate survival curves for clinical events can enable more powerful risk estimation and comparative effectiveness research. However, use of EHR data is hindered by a lack of direct event time observations. Occurrence times of relevant diagnostic codes or target disease mentions in clinical notes are at best a good approximation of the true disease onset time. On the other hand, extracting precise information on the exact event time requires laborious manual chart review and is sometimes altogether infeasible due to a lack of detailed documentation. Current status labels (binary indicators of phenotype status during follow-up) are significantly more efficient and feasible to compile, enabling more precise survival curve estimation given limited resources. Existing survival analysis methods using current status labels focus almost entirely on supervised estimation, and naive incorporation of unlabeled data into these methods may lead to biased estimates. In this article, we propose Semisupervised Calibration of Risk with Noisy Event Times (SCORNET), which yields a consistent and efficient survival function estimator by leveraging a small set of current status labels and a large set of informative features. In addition to providing theoretical justification of SCORNET, we demonstrate in both simulation and real-world EHR settings that SCORNET achieves efficiency akin to the parametric Weibull regression model, while also exhibiting semi-nonparametric flexibility and relatively low empirical bias in a variety of generative settings.
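
To make the notion of a current status label concrete, the sketch below derives such labels from latent event times and follow-up times, then summarizes them by follow-up time. All values and function names are hypothetical, and the grouped proportion is only a naive summary of labeled data; it is not the SCORNET estimator, which the abstract does not specify in detail.

```python
import math
from collections import defaultdict


def current_status_labels(event_times, followup_times):
    """Label is 1 if the event has occurred by the follow-up time, else 0."""
    return [1 if t <= c else 0 for t, c in zip(event_times, followup_times)]


def naive_incidence_by_followup(followup_times, labels):
    """Proportion of positive labels among patients sharing a follow-up time."""
    totals = defaultdict(int)
    positives = defaultdict(int)
    for c, label in zip(followup_times, labels):
        totals[c] += 1
        positives[c] += label
    return {c: positives[c] / totals[c] for c in sorted(totals)}


# Hypothetical latent onset times in years; math.inf means the phenotype
# never develops. In practice these times are unobserved: only the
# follow-up time and the binary label would be available.
event_times = [2.0, 5.0, math.inf, 2.5, 7.5, math.inf]
followup_times = [3.0, 3.0, 3.0, 8.0, 8.0, 8.0]

labels = current_status_labels(event_times, followup_times)
incidence = naive_incidence_by_followup(followup_times, labels)
```

With these toy inputs, one of three patients is positive at the 3-year follow-up and two of three at the 8-year follow-up, which shows how binary labels alone still carry information about the cumulative incidence curve.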


Subject(s)
Electronic Health Records , Humans , Calibration , Bias , Computer Simulation
6.
J Biomed Inform ; 133: 104147, 2022 09.
Article in English | MEDLINE | ID: mdl-35872266

ABSTRACT

OBJECTIVE: The growing availability of electronic health records (EHR) data opens opportunities for integrative analysis of multi-institutional EHR to produce generalizable knowledge. A key barrier to such integrative analyses is the lack of semantic interoperability across different institutions due to coding differences. We propose a Multiview Incomplete Knowledge Graph Integration (MIKGI) algorithm to integrate information from multiple sources with partially overlapping EHR concept codes to enable translations between healthcare systems. METHODS: The MIKGI algorithm combines knowledge graph information from (i) embeddings trained from the co-occurrence patterns of medical codes within each EHR system and (ii) semantic embeddings of the textual strings of all medical codes obtained from the Self-Aligning Pretrained BERT (SAPBERT) algorithm. Due to the heterogeneity in the coding across healthcare systems, each EHR source provides partial coverage of the available codes. MIKGI synthesizes the incomplete knowledge graphs derived from these multi-source embeddings by minimizing a spherical loss function that combines the pairwise directional similarities of embeddings computed from all available sources. MIKGI outputs harmonized semantic embedding vectors for all EHR codes, which improves the quality of the embeddings and enables direct assessment of both similarity and relatedness between any pair of codes from multiple healthcare systems. RESULTS: With EHR co-occurrence data from Veterans Affairs (VA) healthcare and Mass General Brigham (MGB), the MIKGI algorithm produces high-quality embeddings for a variety of downstream tasks including detecting known similar or related entity pairs and mapping VA local codes to the relevant EHR codes used at MGB. Based on the cosine similarity of the MIKGI trained embeddings, the AUC was 0.918 for detecting similar entity pairs and 0.809 for detecting related pairs.
For cross-institutional medical code mapping, the top 1 and top 5 accuracy were 91.0% and 97.5% when mapping medication codes at VA to RxNorm medication codes at MGB, and 59.1% and 75.8% when mapping VA local laboratory codes to the LOINC hierarchy. When trained with 500 labels, the lab code mapping attained top 1 and top 5 accuracy of 77.7% and 87.9%. MIKGI also attained the best performance in selecting VA local lab codes for desired laboratory tests and COVID-19 related features for COVID EHR studies. Compared to existing methods, MIKGI attained the most robust performance, with the highest or near-highest accuracy across all tasks. CONCLUSIONS: The proposed MIKGI algorithm can effectively integrate incomplete summary data from biomedical text and EHR data to generate harmonized embeddings for EHR codes for knowledge graph modeling and cross-institutional translation of EHR codes.
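
The top 1 and top 5 accuracies reported in this abstract can be read as follows: each source code's candidate targets are ranked by cosine similarity, and a mapping counts as correct when the true target appears among the first k candidates. The sketch below shows that evaluation on toy data. The code names, two-dimensional vectors and gold mapping are hypothetical, and the function is a generic top-k evaluation, not the authors' implementation.

```python
import math


def cosine(u, v):
    """Cosine similarity of two non-zero, equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))


def top_k_accuracy(source_vecs, target_vecs, gold, k):
    """Fraction of source codes whose gold target ranks within the top k."""
    hits = 0
    for src, vec in source_vecs.items():
        ranked = sorted(
            target_vecs,
            key=lambda tgt: cosine(vec, target_vecs[tgt]),
            reverse=True,
        )
        if gold[src] in ranked[:k]:
            hits += 1
    return hits / len(source_vecs)


# Hypothetical harmonized embeddings for local codes and standard codes.
source_vecs = {"local_lab_1": [1.0, 0.0], "local_lab_2": [0.6, 0.8]}
target_vecs = {
    "std_code_a": [0.9, 0.1],
    "std_code_b": [0.1, 0.9],
    "std_code_c": [0.7, 0.7],
}
gold = {"local_lab_1": "std_code_a", "local_lab_2": "std_code_b"}

top1 = top_k_accuracy(source_vecs, target_vecs, gold, k=1)
top2 = top_k_accuracy(source_vecs, target_vecs, gold, k=2)
```

In this toy case `local_lab_2` is closest to `std_code_c`, so its gold target is missed at k=1 but recovered at k=2, which is why top-k accuracy can only stay level or rise as k grows.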


Subject(s)
COVID-19 , Electronic Health Records , Algorithms , Humans , Logical Observation Identifiers Names and Codes , Pattern Recognition, Automated
7.
NPJ Digit Med ; 4(1): 151, 2021 Oct 27.
Article in English | MEDLINE | ID: mdl-34707226

ABSTRACT

The increasing availability of electronic health record (EHR) systems has created enormous potential for translational research. However, it is difficult to know all the relevant codes related to a phenotype due to the large number of codes available. Traditional data mining approaches often require the use of patient-level data, which hinders the ability to share data across institutions. In this project, we demonstrate that multi-center large-scale code embeddings can be used to efficiently identify relevant features related to a disease of interest. We constructed large-scale code embeddings for a wide range of codified concepts from EHRs from two large medical centers. We developed knowledge extraction via sparse embedding regression (KESER) for feature selection and integrative network analysis. We evaluated the quality of the code embeddings and assessed the performance of KESER in feature selection for eight diseases. In addition, we developed an integrated clinical knowledge map combining embedding data from both institutions. The features selected by KESER were comprehensive compared to lists of codified data generated by domain experts. Features identified via KESER resulted in comparable performance to those built upon features selected manually or with patient-level data. The knowledge map created using an integrative analysis identified disease-disease and disease-drug pairs more accurately compared to those identified using single institution data. Analysis of code embeddings via KESER can effectively reveal clinical knowledge and infer relatedness among codified concepts. KESER bypasses the need for patient-level data in individual analyses, providing a significant advance in enabling multi-center studies using EHR data.
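
The abstract does not state the exact objective behind sparse embedding regression, so the sketch below assumes a plain lasso: the embedding of a target code is regressed on the embeddings of candidate codes, and candidates with non-zero coefficients are kept as selected features. The code names, vectors, penalty value and the coordinate-descent solver are all illustrative assumptions, not the KESER implementation.

```python
def soft_threshold(z, lam):
    """Shrink z toward zero by lam; values within [-lam, lam] become 0."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0


def lasso_coordinate_descent(columns, y, lam, n_iter=200):
    """Minimize 0.5 * ||y - sum_j b_j * x_j||^2 + lam * sum_j |b_j|."""
    beta = [0.0] * len(columns)
    residual = list(y)  # y minus the current fit (fit is zero at the start)
    for _ in range(n_iter):
        for j, x in enumerate(columns):
            norm_sq = sum(v * v for v in x)
            if norm_sq == 0.0:
                continue
            # Inner product of x_j with the residual that excludes x_j's own term.
            rho = sum(xi * (ri + beta[j] * xi) for xi, ri in zip(x, residual))
            new_beta = soft_threshold(rho, lam) / norm_sq
            delta = new_beta - beta[j]
            if delta != 0.0:
                residual = [ri - delta * xi for ri, xi in zip(residual, x)]
                beta[j] = new_beta
    return beta


# Hypothetical embeddings: a target code and three candidate codes.
target = [1.0, 0.0, 0.5]
candidates = {
    "code_A": [1.0, 0.0, 0.0],
    "code_B": [0.8, 0.0, 0.6],
    "code_C": [0.0, 1.0, 0.0],  # orthogonal to the target
}

beta = lasso_coordinate_descent(list(candidates.values()), target, lam=0.1)
selected = [name for name, b in zip(candidates, beta) if b != 0.0]
```

In this toy setup `code_C` receives a coefficient of exactly zero and is dropped, while `code_A` and `code_B` are retained. Only embedding vectors enter the computation, which is consistent with the abstract's point that patient-level data are not needed for this step.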

8.
Econom J ; 24(3): 559-588, 2021 Sep.
Article in English | MEDLINE | ID: mdl-38223304

ABSTRACT

We propose double/debiased machine learning approaches to infer a parametric component of a logistic partially linear model. Our framework is based on a Neyman orthogonal score equation consisting of two nuisance models for the nonparametric component of the logistic model and conditional mean of the exposure with the control group. To estimate the nuisance models, we separately consider the use of high dimensional (HD) sparse regression and (nonparametric) machine learning (ML) methods. In the HD case, we derive certain moment equations to calibrate the first order bias of the nuisance models, which preserves the model double robustness property. In the ML case, we handle the nonlinearity of the logit link through a novel and easy-to-implement 'full model refitting' procedure. We evaluate our methods through simulation and apply them in assessing the effect of the emergency contraceptive pill on early gestation and new births based on a 2008 policy reform in Chile.

9.
J Am Med Inform Assoc ; 27(8): 1235-1243, 2020 08 01.
Article in English | MEDLINE | ID: mdl-32548637

ABSTRACT

OBJECTIVE: A major bottleneck hindering utilization of electronic health record data for translational research is the lack of precise phenotype labels. Chart review as well as rule-based and supervised phenotyping approaches require laborious expert input, hampering applicability to studies that require many phenotypes to be defined and labeled de novo. Though International Classification of Diseases codes are often used as surrogates for true labels in this setting, these sometimes suffer from poor specificity. We propose a fully automated topic modeling algorithm to simultaneously annotate multiple phenotypes. MATERIALS AND METHODS: Surrogate-guided ensemble latent Dirichlet allocation (sureLDA) is a label-free multidimensional phenotyping method. It first uses the PheNorm algorithm to initialize probabilities based on 2 surrogate features for each target phenotype, and then leverages these probabilities to constrain the LDA topic model to generate phenotype-specific topics. Finally, it combines phenotype-feature counts with surrogates via clustering ensemble to yield final phenotype probabilities. RESULTS: sureLDA achieves reliably high accuracy and precision across a range of simulated and real-world phenotypes. Its performance is robust to phenotype prevalence and relative informativeness of surrogate vs nonsurrogate features. It also exhibits powerful feature selection properties. DISCUSSION: sureLDA combines attractive properties of PheNorm and LDA to achieve high accuracy and precision robust to diverse phenotype characteristics. It offers particular improvement for phenotypes insufficiently captured by a few surrogate features. Moreover, sureLDA's feature selection ability enables it to handle high feature dimensions and produce interpretable computational phenotypes. CONCLUSIONS: sureLDA is well suited toward large-scale electronic health record phenotyping for highly multiphenotype applications such as phenome-wide association studies.


Subject(s)
Algorithms , Electronic Health Records , Natural Language Processing , Electronic Health Records/classification , Humans , Precision Medicine , ROC Curve , Translational Research, Biomedical
10.
Sci Rep ; 8(1): 622, 2018 01 12.
Article in English | MEDLINE | ID: mdl-29330528

ABSTRACT

Investigating how genes jointly affect complex human diseases is important, yet challenging. The network approach (e.g., weighted gene co-expression network analysis (WGCNA)) is a powerful tool. However, genomic data usually contain substantial batch effects, which could mask true genomic signals. Paired design is a powerful tool that can reduce batch effects. However, it is currently unclear how to appropriately apply WGCNA to genomic data from paired design. In this paper, we modified the current WGCNA pipeline to analyse high-throughput genomic data from paired design. We illustrated the modified WGCNA pipeline by analysing the miRNA dataset provided by Shiah et al. (2014), which contains forty oral squamous cell carcinoma (OSCC) specimens and their matched non-tumourous epithelial counterparts. OSCC is the sixth most common cancer worldwide. The modified WGCNA pipeline identified two sets of novel miRNAs associated with OSCC, in addition to the existing miRNAs reported by Shiah et al. (2014). Thus, this work will be of great interest to readers of various scientific disciplines, in particular, genetic and genomic scientists as well as medical scientists working on cancer.


Subject(s)
Computational Biology/methods , Gene Regulatory Networks , Neoplasms/genetics , Databases, Genetic , Gene Expression Profiling , Gene Expression Regulation, Neoplastic , Humans , Sequence Analysis, DNA
11.
Front Biosci (Landmark Ed) ; 18(2): 588-97, 2013 01 01.
Article in English | MEDLINE | ID: mdl-23276944

ABSTRACT

Although microRNAs (miRNAs) have been implicated in fine-tuning gene networks, the roles of mmu-mir-143 (miR-143) in mammalian ovary development have not been studied in vitro. We investigated the expression and function of miR-143 in the mouse ovary during primordial follicle formation. Real-time polymerase chain reaction analysis showed that miR-143 expression increased during primordial follicle formation from 15.5 days post-coitus to 4 days post-partum. miR-143 was located in pregranulosa cells by in situ hybridization. To study the function of miR-143 in primordial follicle formation we established an electroporation transfection model in vitro that allowed miR-143 expression to be efficiently upregulated and inhibited in cultured ovaries. Further studies showed that miR-143 inhibited the formation of primordial follicles by suppressing pregranulosa cell proliferation and downregulating the expression of genes related to the cell cycle. These findings suggest that miR-143 is critical for the formation of primordial follicles and regulates ovarian development and function.


Subject(s)
MicroRNAs/physiology , Ovarian Follicle/physiology , Animals , Cell Cycle Proteins/biosynthesis , Cell Cycle Proteins/drug effects , Female , Male , Mice , MicroRNAs/biosynthesis , Ovarian Follicle/drug effects , Ovary/growth & development
12.
PLoS One ; 7(3): e33861, 2012.
Article in English | MEDLINE | ID: mdl-22479460

ABSTRACT

BACKGROUND: MicroRNAs (miRNAs) play vital regulatory roles in many cellular processes. The expression of miRNA (miR)-34c is highly enriched in adult mouse testis, but its roles and underlying mechanisms of action are not well understood. METHODOLOGY/PRINCIPAL FINDINGS: In the present study, we show that miR-34c is detected in mouse pachytene spermatocytes and continues to be highly expressed in spermatids. To explore the specific functions of miR-34c, we have established an in vivo model by transfecting miR-34c inhibitors into primary spermatocytes to study the loss-of-function of miR-34c. The results show that silencing of miR-34c significantly increases the Bcl-2/Bax ratio and protects germ cells from apoptosis induced by deprivation of testosterone. Moreover, ectopic expression of miR-34c in GC-2 cells triggers apoptosis with a decreased Bcl-2/Bax ratio, and miR-34c inhibition leads to a low spontaneous apoptotic ratio and an increased Bcl-2/Bax ratio. Furthermore, ectopic expression of miR-34c reduces ATF1 protein expression without affecting the ATF1 mRNA level via directly binding to ATF1's 3'UTR, indicating that ATF1 is one of miR-34c's target genes. Meanwhile, the knockdown of ATF1 significantly decreases the Bcl-2/Bax ratio and triggers GC-2 cell apoptosis. Inhibition of miR-34c does not decrease the GC-2 cell apoptosis ratio in ATF1 knockdown cells. CONCLUSIONS/SIGNIFICANCE: Our study shows for the first time that miR-34c functions, at least partially, by targeting the ATF1 gene in germ cell apoptosis, providing a novel mechanism with involvement of miRNA in the regulation of germ cell apoptosis.


Subject(s)
Activating Transcription Factor 1/genetics , Apoptosis/genetics , MicroRNAs/metabolism , Spermatozoa/metabolism , Animals , Apoptosis/drug effects , Base Sequence , Cell Line , Drug Resistance/genetics , Female , Flutamide/pharmacology , Gene Expression Regulation, Developmental , Gene Order , Gene Silencing , Male , Mice , Molecular Sequence Data , Spermatozoa/drug effects , Testis/embryology , Testis/metabolism , bcl-2-Associated X Protein/metabolism